Abstract: The Resource Description Framework (RDF) is a semantic network data model that is used to create machine-understandable
descriptions of the world and is the basis of the Semantic Web. This article discusses the application of RDF to
the representation of computer software and virtual computing machines. The Semantic Web is posited as not only a web of data, but 
also as a web of programs and processes. 
1 Introduction

At its core, the Semantic Web is a global graph data 
structure used to describe web resources in a machine-understandable
way [1]. Unlike the World Wide Web, in
which document resources are interconnected through a 
single type of relationship (i.e. href, hyper-text links), 
on the Semantic Web, resources are related to one an- 
other through a heterogeneous set of relationships. The 
set of resources and relationship types are identified by 
Uniform Resource Identifiers (URIs) [2]. The Resource
Description Framework (RDF) is a standard for graphing
(i.e. relating) URIs, literal values, and blank nodes (or
anonymous nodes) [3]. If $U$ is the set of all URIs, $L$ is
the set of all literal values, and $B$ is the set of all blank
nodes, then an RDF triple (or link) is defined as $(s, p, o)$,
where $s \in (U \cup B)$, $p \in U$, and $o \in (U \cup L \cup B)$. The
union of all triples constitutes the Semantic Web graph
and can be generally defined as

$$G \subseteq ((U \cup B) \times U \times (U \cup L \cup B)).$$
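The membership constraints on a triple can be sketched in a few lines of Python. This is only an illustration of the definition above: the class names `URI`, `Literal`, and `BlankNode`, the function `is_valid_triple`, and the example.org URI are assumptions introduced here and do not come from any RDF toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


def is_valid_triple(s, p, o):
    """Return True if (s, p, o) lies in (U ∪ B) × U × (U ∪ L ∪ B)."""
    return (
        isinstance(s, (URI, BlankNode))
        and isinstance(p, URI)
        and isinstance(o, (URI, Literal, BlankNode))
    )


# Hypothetical resources used only for this illustration.
marko = URI("http://example.org/marko")
knows = URI("http://xmlns.com/foaf/0.1/knows")
name = URI("http://xmlns.com/foaf/0.1/name")
anon = BlankNode("b0")

# A graph G is simply a set of valid triples.
graph = {
    (marko, knows, anon),
    (marko, name, Literal("marko")),
}
```

A literal in the subject position, or a blank node in the predicate position, fails the check, which mirrors the asymmetry in the definition of $G$.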

At the level of RDF, the Semantic Web is simply a collec- 
tion of triples. These triples form a data structure known 
as a directed edge labeled graph (or multi-relational net- 
work). However, in order to create a layer of abstraction 
to describe how resources should be interrelated and to 
reason about and infer non-explicit relationships between
resources, ontological languages have been developed.
The two most prevalent Semantic Web languages are
the RDF Schema (RDFS) [4] and the Web Ontology
Language (OWL) [5]. For a fine, practical review of these
two languages see [6].

The prevalent conception of the Semantic Web is 
that of a well-structured, massive-scale distributed data 
repository that can be utilized by applications for various 
purposes. However, the RDF data model is general 
enough to support not only the representation of data, 
but also the representation of process. The purpose of this
article is to discuss the general use of rule and process
information in the Semantic Web and their explicit real- 
ization as RDF encoded software programs and RDF vir- 
tual machines (RVM). The remainder of this section will 
introduce 1.) the Linked Data initiative and its intention 
of creating a massive-scale distributed data structure and 
2.) the RDF programming and virtual machine initiative 
and its intention of creating a massive-scale distributed 
process infrastructure. The latter initiative is a nascent 
movement which has the potential to greatly advance 
the utility of the Semantic Web and, as previously stated, 
forms the primary point of discussion for this article. 

1.1 Linked Data as a Distributed Data Structure

The Linked Data initiative is concerned with exposing 
data within the URI address space much like the World 
Wide Web initiative is concerned with exposing documents
and media in the URL address space [7]. Before
discussing the Linked Data movement, it is important 
to understand how the Semantic Web serves not simply 
as a data repository, but more importantly as a sin- 
gle massive-scale distributed database. Moreover, it is 
important to discuss how the Semantic Web provides 
both a technological and cultural differentiation from 
the traditional notion of a database as posited by the 
relational database community. These factors position the
Semantic Web as a revolutionary means by
which data is globally managed and accessed.

Technologically, the Semantic Web is reminiscent of the 
relational database model, insofar as it is a data storage 
environment that provides well-structured data to ex- 
ternal applications; though this data is not represented 
as a collection of interlinked tables, but instead as an 
edge labeled graph (more specifically, an RDF graph). 
While the table and graph data structures can be mapped 
into one another without loss of information, the utility 
of the graph structure has yielded the development of 
specialized graph databases known as triple stores [8].¹
Moreover, the data exposed by the Semantic Web is 

1. It is important to note that many RDF servers still utilize an 
underlying relational database to manage data. Examples of such 
architectures include the D2R Server [9].
within the URI address space and as such is agnostic
to the addressing scheme of the underlying machine 
supporting its representation. In this way, the data on 
multiple physical machines are able to reference each 
other and thus, the Semantic Web serves as a single 
unified graph spanning servers worldwide.

Culturally, the Semantic Web maintains the open, glob- 
ally accessible nature of the World Wide Web. In contrast, 
rarely are relational database schemas reused and/or
openly distributed and rarely are relational database 
ports (e.g. ODBC) made available for the public har- 
vesting of information. The common paradigm in the 
relational database world is that data is accessed and 
manipulated by software with privileges to the data 
and only through that software is the information made 
available to other services, if at all. However, with re- 
spect to the Semantic Web, not only does the community 
encourage the distribution and reuse of ontologies,² but
it also provides open and accessible interfaces to its
data. Such interfaces are known as SPARQL endpoints
[10] and HTTP-based linked RDF data [7]. The Semantic
Web truly represents a new data management paradigm 
because of the way in which data is distributed and 
discovered: in an open, standards-based fashion. 

The Semantic Web's Linked Data community is fo- 
cused on the systematic union of RDF datasets in order 
to allow 

"[any man or machine] to start with one data 
source and then move through a potentially 
endless Web of data sources connected by RDF 
links. Just as the traditional document Web 
can be crawled by following hypertext links, 
the Web of Data can be crawled by following 
RDF links. Working on the crawled data, search 
engines can provide sophisticated query capa- 
bilities, similar to those provided by conven- 
tional relational databases. Because the query 
results themselves are structured data, not just 
links to HTML pages, they can be immediately 
processed, thus enabling a new class of applications
based on the Web of Data." [11]
There is far-reaching potential in the Web of Data, both
as it currently exists and as it will continue to grow.
However, one of the limiting factors in the Linked Data 
approach is that while the community is providing a 
massive-scale distributed data structure, it is not
providing a massive-scale distributed process infrastructure
to compute on this Web of data [12]. Without a
distributed process infrastructure, Semantic Web applications
are left with the typical server/client-download
philosophy of the World Wide Web. For data intensive 
algorithms, this is an inefficient use of resources as it 
requires the movement of large amounts of data to the 
algorithm's executing machine(s). It is this design choice 
that has made the World Wide Web (i.e. the web of 
HTML documents), at large, only accessible to those 

2. Schema Web is available at http://www.schemaweb.info/



that have the processing power and space to download 
and index it.³ For the keyword search space of the
World Wide Web, this problem is perhaps best solved 
by the few large-scale search engines in existence today.
However, the Semantic Web, with its rich data model 
and nearly endless potential, is poised to require a new 
Web infrastructure to support its processing within and 
between its various Linked Data repositories. No single 
institution or organization will have the compute power, 
nor the man power, to execute and implement all the 
potentially useful algorithms that will make the Semantic
Web stand out as the de facto medium for representing
data. In order to remedy this situation, a move towards a 
computing paradigm for the Semantic Web is necessary. 

1.2 RVM Computing as a Distributed Process Infrastructure

The Semantic Web has the potential to not only act 
as a data storage environment, but also capture the 
more procedural aspects of computing, such as computer 
instructions and abstract virtual computing machines. In 
other words, given the flexibility of the RDF data model,
it is possible to encode, in RDF, the rules by which RDF 
data is manipulated and thus, expose such information 
on the Semantic Web. Moreover, the URI address space 
is an infinite space that is only constrained by the size 
and number of physical machines that are supporting its 
representation. A flexible data model and an infinite ad- 
dress space make the Semantic Web an ideal medium for 
distributed, global computing. In this more computation- 
centric environment, instructions expressed in RDF are 
executed by RDF virtual machines (RVM). An RVM is 
any entity that processes RDF computing instructions, 
and in some instances, is represented in RDF as well. 
Thus, like other RDF data, computing instructions and 
RVMs are "first-class" citizens on the Semantic Web. 

Many common computing models are made salient by 
the RVM paradigm, such as open (refer to Section 4.1),
distributed (refer to Section 4.2), and reflective computing
(refer to Section 4.3). RDF programming languages
compile down to RDF and these RDF instructions can 
be accessed, annotated (i.e. RDF related), and reasoned 
on like any other RDF data on the Semantic Web. Fur- 
thermore, unique situations emerge when RDF code is 
represented across different physical machines. Because 
all RDF instructions are in the same URI address space, 
there is nothing that prevents the software, much like the
data, from being physically distributed. With an RDF virtual
machine executing RDF instructions, it is possible for the 
virtual machine and the instructions to be relocated by 
simply downloading the RDF subgraph that represents 
that virtual machine to another physical machine. Thus, 
instead of migrating large amounts of data to a local 

3. A distributed process infrastructure is a feature of the Grid computing
paradigm that provides a democratization of compute cycles
[13]. With respect to the Semantic Web, systems like GridVine provide
a means to efficiently query and update an RDF graph that is overlaid
across multiple physical machines [14].
environment for processing, the RDF virtual machine
and the instructions that it is processing can be migrated 
to the remote environment. In this way, the process is 
moved to the data, not the data to the process. Finally, 
because RDF computing instructions and, in some cases, 
RVMs are represented in RDF, reflection is possible at 
the object, instruction, and machine-level. Thus, nearly 
the entire computing stack is exposed to reasoning and 
self-modifying processes. 
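The idea of relocating a virtual machine by downloading its RDF subgraph can be sketched as a graph traversal. The following is a minimal illustration, assuming that the machine and its instructions are exactly the triples reachable from a single root node; the function `reachable_subgraph` and every `ex:` name are hypothetical and are not part of Fhat or any other RVM.

```python
from collections import deque


def reachable_subgraph(triples, root):
    """Collect every triple reachable by following edges outward from root."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))

    seen, queue, result = {root}, deque([root]), set()
    while queue:
        node = queue.popleft()
        for triple in by_subject.get(node, []):
            result.add(triple)
            target = triple[2]
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return result


# A hypothetical store holding a small machine next to unrelated data.
store = {
    ("ex:vm1", "ex:programCounter", "ex:instr1"),
    ("ex:instr1", "ex:next", "ex:instr2"),
    ("ex:vm1", "ex:operandStack", "ex:stack1"),
    ("ex:bigData", "ex:value", "42"),
}

# Only the machine's own subgraph would need to travel to the remote host.
vm_graph = reachable_subgraph(store, "ex:vm1")
```

In this sketch the three machine triples are selected and the unrelated data triple stays where it is, which is the sense in which the process moves to the data.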

The structure of this article is as follows. Section 2
discusses RDF-based programming languages in general
and one language specifically: Neno [15]. Related work is
also presented at the end of Section 2. Section 3 discusses
the relationship between RDF computing instructions
and RVMs and more specifically the Fhat and r-Fhat
RVMs. Finally, Section 4 discusses those aspects of computing
on the Semantic Web (open, distributed, and
reflective computing) that are conveniently exposed by
the use of RDF-based programming languages, RVMs, and
RDF computing in general.

2 RDF-Based Programming Languages 

RDF is used to express facts about the world in a 
structured and machine understandable fashion. 
"The basic intuition of model-theoretic seman- 
tics is that asserting a sentence makes a claim 
about the world: it is another way of saying 
that the world is, in fact, so arranged as to 
be an interpretation which makes the sentence 
true. In other words, an assertion amounts to 
stating a constraint on the possible ways the 
world might be." [16]
However, RDF is more generally useful and need not 
be constrained to asserting facts about the "world". In 
this article, RDF is used to represent computational data 
structures such as software (i.e. a sequence of instruc- 
tions) and machine state (i.e. operand stacks, program 
counters, etc.). Common data structures such as lists, 
trees, and graphs in general are conveniently represented 
in RDF, as are programs of Turing complete languages 
[17]. This section focuses on one Turing complete RDF-based
programming language called Neno [15].⁴ Other
RDF-based programming languages include the functional,
stack-based language Ripple [18],⁵ and the object-oriented
FABL [19],⁶ Adenosine,⁷ and Adenine⁸ [20]
languages. These languages, along with RDF toolkits,
RDF-to-object mappers, and Web-based rule languages 
will be discussed in the related work section. 



4. Neno/Fhat is currently available at http://neno.lanl.gov/ 



5. Ripple is currently available at http://ripple.fortytwo.net/ 

6. FABL is currently available at http://fabl.net/

7. Unfortunately, Adenosine has no formal publications nor a cur- 
rently existing homepage. However, there are discussions of it on 
various Semantic Web mailing lists as well as a project homepage 
that is available through the Internet Archive. This article will briefly 
discuss its formalisms to provide a more complete picture of RDF- 
programming. 

8. Adenine is available at http://www.ifcx.org/wiki/Adenine.html



2.1 The Neno RDF Programming Language 

Neno is an imperative programming language that takes 
an object-oriented perspective on the resources of the Semantic
Web [15]. In Neno, the human readable/writable
language's grammar is similar to popular object-oriented 
programming languages such as Java and C++. How- 
ever, as opposed to typical object-oriented languages, 
many of the constructs of the Neno language were de- 
signed to take advantage of the RDF data model and the 
standardized means by which RDF data is queried and
modified (e.g. SPARQL [10] and SPARQL/Update [21],
respectively). The motivation for many of the language 
constructs is to overcome the impedance mismatch be- 
tween the typical object-oriented data model and the 
RDF data model. 

Neno source code is written by a human programmer 
and is compiled by the Neno/Fhat compiler. The compilation
process generates a Fhat API represented in
OWL. A Fhat API denotes Neno classes, their respective 
methods, and each method's instructions. In this sense, 
a Fhat API is similar to the API of object-oriented lan- 
guages (e.g. the typical Java jar file). Classes in a Fhat 
API can be instantiated to active computational objects 
represented in RDF. These instantiated objects main- 
tain low-level computing instructions (e.g. add, set, 
branch, etc.) represented in RDF. These instructions 
denote computational primitives and specify the flow 
of execution within a method (refer to Section 3.2.1).
Figure 1 diagrams the stages of processing required
to go from human readable/writeable Neno source code
to instantiated computational objects. 
Fig. 1. The various transformations from source code to
computational object in Neno/Fhat. 

This sub-section will discuss the Neno programming 
language in particular and Section 3.2.1 will discuss the 
role of the Fhat RVM in the processing of compiled Neno 
source code. 

2.1.1 Neno Language Constructs 

The following code example presents a simple Neno 
class and will be referred to throughout the remainder 
of this section.⁹

9. For the sake of brevity, those operations that are typically found
in other programming languages are not discussed in this section. For
example, for-looping, while-looping, and if/else branching have a
similar syntax and behavior as found in other programming languages
such as Java and C. For an in-depth discussion of the Neno programming
language constructs, please refer to [15].
prefix lanl: <...>;
prefix foaf: <...>;

foaf:Agent lanl:Person {
  xsd:string foaf:name[1];
  lanl:Person foaf:knows[0..*];

  makeFriend(lanl:Person p) {
    this.foaf:knows =+ p;
  }

  makeEnemy(lanl:Person p) {
    this.foaf:knows =- p;
  }

  makeAllEnemies() {
    this.foaf:knows =/;
  }

  xsd:boolean isFriend(lanl:Person p) {
    return this.foaf:knows =? p;
  }
}

The lanl:Person class description is written in object-oriented
syntax and is saved to a single text file denoted
Person.neno. The class is composed of two fields and
four methods and states that

• lanl:Person is an rdfs:subClassOf foaf:Agent
(i.e. it extends foaf:Agent),

• foaf:name is an xsd:string field that has a
cardinality owl:Restriction of 1,

• foaf:knows is a lanl:Person field that does not
have a cardinality owl:Restriction,

• makeFriend is a void method that takes a single
lanl:Person as an argument,

• makeEnemy is a void method that takes a single
lanl:Person as an argument,

• makeAllEnemies is a void method that takes no
arguments, and

• isFriend is an xsd:boolean method that takes a
single lanl:Person as an argument.

Fields in Neno are assumed to be unordered sets because,
for example, a lanl:Person resource may have many
foaf:knows relationships to other lanl:Person resources. As
such, special set operators exist for interacting with Neno
fields:

• the set-plus operator (=+): adds (i.e. unions) a new 
value to a field. 

• the set-minus operator (=-): removes (i.e. set mi- 
nuses) an existing value from a field. 

• the set-clear operator (=/): removes all existing val- 
ues from a field. 

• the set-query operator (=?): returns a boolean spec- 
ifying whether the provided value currently exists 
in the field. 

• the set operator (=): removes all existing values and 
adds the provided values to the field. 
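The five operators listed above can be read as operations on the set of triples that share a subject and a predicate. The following Python sketch illustrates that reading only; the class `Field`, its method names, and the resource lanl:josh are assumptions made for this example and are not Neno or Fhat identifiers.

```python
class Field:
    """A field: all objects o such that (subject, predicate, o) is in the store."""

    def __init__(self, store, subject, predicate):
        self.store, self.s, self.p = store, subject, predicate

    def values(self):
        return {o for (s, p, o) in self.store if s == self.s and p == self.p}

    def set_plus(self, value):       # =+  add one value
        self.store.add((self.s, self.p, value))

    def set_minus(self, value):      # =-  remove one value
        self.store.discard((self.s, self.p, value))

    def set_clear(self):             # =/  remove every value
        for o in self.values():
            self.store.discard((self.s, self.p, o))

    def set_query(self, value):      # =?  membership test
        return (self.s, self.p, value) in self.store

    def set_assign(self, *values):   # =   clear, then add the given values
        self.set_clear()
        for v in values:
            self.set_plus(v)


store = set()
knows = Field(store, "lanl:marko", "foaf:knows")
knows.set_plus("lanl:dr_wh")     # like makeFriend(lanl:dr_wh)
knows.set_plus("lanl:josh")
knows.set_minus("lanl:dr_wh")    # like makeEnemy(lanl:dr_wh)
```

Because the store is a set, adding an existing value or removing a missing one is a no-op, which matches the unordered-set semantics assumed for fields.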



In Neno, the typical object-oriented "dot" notation is used to
reference the fields and methods of an object. For example,
with respect to object fields, the statement

lanl:marko.foaf:knows.foaf:name;

returns the xsd:string foaf:names of all resources
that lanl:marko knows. That is, it first resolves all the
lanl:Person instances that lanl:marko knows and
then resolves all the xsd:string foaf:name values
of those lanl:Persons. Neno also supports a non-standard
"dot dot" notation that is used for inverse
referencing. For example, the statement

lanl:marko..foaf:knows.foaf:name;

returns the names of all the resources that know
lanl:marko. That is, it determines all the
lanl:Persons that foaf:knows lanl:marko and
then returns the xsd:string of their foaf:name.
The difference between "dot" notation and "dot dot"
notation can be illustrated with two SPARQL queries.
The first "dot" statement translates to the query

SELECT ?y 
WHERE { 

<lanl:marko> <foaf:knows> ?x . 
?x <foaf:name> ?y } 

and the second "dot dot" statement translates to 

SELECT ?y 
WHERE { 

?x <foaf:knows> <lanl:marko> . 
?x <foaf:name> ?y }. 
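The two queries differ only in which end of the foaf:knows edge is bound to lanl:marko. A minimal Python sketch over an in-memory set of triples makes the symmetry explicit; the helper names `forward` and `inverse`, the resource lanl:josh, and the sample data are assumptions for illustration, not part of Neno.

```python
def forward(store, nodes, predicate):
    """'dot' step: follow predicate from each node to its objects."""
    return {o for (s, p, o) in store if p == predicate and s in nodes}


def inverse(store, nodes, predicate):
    """'dot dot' step: follow predicate backwards from each node to its subjects."""
    return {s for (s, p, o) in store if p == predicate and o in nodes}


# Hypothetical data: marko knows dr_wh, and josh knows marko.
store = {
    ("lanl:marko", "foaf:knows", "lanl:dr_wh"),
    ("lanl:josh", "foaf:knows", "lanl:marko"),
    ("lanl:dr_wh", "foaf:name", "Dr. Wh"),
    ("lanl:josh", "foaf:name", "Josh"),
}

# lanl:marko.foaf:knows.foaf:name
dot = forward(store, forward(store, {"lanl:marko"}, "foaf:knows"), "foaf:name")

# lanl:marko..foaf:knows.foaf:name
dot_dot = forward(store, inverse(store, {"lanl:marko"}, "foaf:knows"), "foaf:name")
```

With this data, the "dot" path yields the name of the person Marko knows, while the "dot dot" path yields the name of the person who knows Marko.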

The "dot dot" notation takes advantage of the network 
structure of the underlying RDF data model and the 
ability to traverse that graph in any direction using 
query languages such as SPARQL. This is related to the 
"ancestor" query mechanisms found in XPath [22] and 
used in semi-structured data environments such as Lorel
[23].

Like typical object-oriented programming languages, 
"dot" notation can be used to invoke an object's method. 
For example, consider the makeEnemy method declaration
for lanl:Person. The purpose of this method is
to remove a foaf:knows relationship between the executing
object (i.e. this) and the provided lanl:Person
parameter p. Thus,

lanl:marko.makeEnemy(lanl:dr_wh);

executes the following SPARQL/Update command:

DELETE { 

<lanl:marko> <foaf:knows> <lanl:dr_wh> }. 

In Neno, the "dot dot" notation can also be applied to 
methods, and in such cases, it is called inverse method 
invocation. Inverse method invocation can be used to 
remove all the foaf:knows relations between those
lanl:Persons that lanl:marko foaf:knows and
lanl:dr_wh. In other words, all of Marko's friends can
be instructed to make enemies with Dr. Wh. In Neno
syntax, this is represented as

lanl:marko..foaf:knows.makeEnemy(lanl:dr_wh);

While Neno has many similarities to typical object- 
oriented programming languages such as Java and C++, 
perhaps the most interesting aspect of Neno's program- 
ming constructs is the way in which it takes advantage of 
the underlying RDF representation of its instantiated objects.
For a more in-depth review of the Neno programming
language, which includes discussion of looping,
branching, as well as object construction and destruction,
please refer to [15]. Finally, note that Section 3.2.1 will
discuss compiled Neno code and its representation in 
RDF. 

2.2 Related Work 

The ideas behind Neno come from a long line of Web-based
process models. This subsection will provide a
review of related work in this area with particular focus 
on other RDF programming languages, RDF toolkits, 
RDF-to-object mappers, and finally, other popular pro- 
cess description mechanisms for the Web. 

2.2.1 Other RDF Programming Languages

Other known RDF programming languages include Ripple
[18], FABL [19], Adenosine, and Adenine [20]. These
languages have a similar philosophy to Neno in
that they are motivated by the desire to encode both
data and process information within RDF and thus, take 
unique advantage of the Semantic Web infrastructure. 
The general theme behind all of these languages is to 
turn the Web into a distributed computing environment. 

Ripple is a declarative programming language aimed 
at Semantic Web mashups and scripting applications. 
In Ripple, human-readable programs expressed in a 
Notation3-like [24] serialization language are translated
to and from RDF computing instructions in the form of 
linked RDF lists. The RDF lists which make up a Ripple 
script are intended to reside in the Semantic Web itself 
and thus, are at the same level of abstraction as the data 
they operate upon. 

Ripple is particularly useful for path-based traversals. 
For example, the Ripple query 

krs:josh foaf:knows! foaf:name!

yields the names of all of the individuals that
krs:josh knows. For example, consider the RDF
graph illustrated in Figure 2. The above Ripple query
has the effect of pushing both "marko"^^xsd:string
and "gary"^^xsd:string onto the Ripple
RVM stack.¹⁰ Once "marko"^^xsd:string and
"gary"^^xsd:string are on the stack, other
operations can be performed on that data.

10. The term RVM was introduced in [15] and refers to a virtual machine
that processes RDF instructions. However, unlike the languages
presented here, the Fhat RVM of [15] was also encoded in RDF. For
the purpose of this article, both RDF and non-RDF represented virtual 
machines that process RDF instructions are called RVMs. 
Fig. 2. An example RDF graph.



An interesting aspect of the Ripple language is its 
ability to "walk" an RDF graph in a recursive fashion. 
For example, the following Ripple query 

krs:josh foaf:knows* foaf:name!

yields Josh's name, the names of those known by
Josh, and so on, recursively. Figure 3 diagrams the explicit
RDF representation of this Ripple program, where
_:L1, _:L2, _:L3, _:L4, and _:L5 are blank nodes of
rdf:type rdf:List.

[Figure 3 diagram: a linked rdf:List of blank nodes _:L1 through _:L5, chained by rdf:rest, whose rdf:first values are krs:josh, foaf:knows, stack:starApply, foaf:name, and stack:apply.]

Fig. 3. An RDF representation of a compiled Ripple 
program. 
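The list structure described for Figure 3 can be sketched as a round trip between a sequence of program tokens and rdf:first/rdf:rest triples. This is an illustration only: the functions `encode_list` and `decode_list` are hypothetical, triples are plain Python tuples rather than Ripple's own data structures, and terminating the list with rdf:nil is an assumption based on the standard RDF collection vocabulary rather than something the figure residue confirms.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def encode_list(items):
    """Return (head, triples) for a linked rdf:List holding items in order."""
    if not items:
        return RDF_NIL, set()
    labels = [f"_:L{i}" for i in range(1, len(items) + 1)]
    triples = set()
    for i, (node, item) in enumerate(zip(labels, items)):
        nxt = labels[i + 1] if i + 1 < len(labels) else RDF_NIL
        triples.add((node, "rdf:type", "rdf:List"))
        triples.add((node, RDF_FIRST, item))
        triples.add((node, RDF_REST, nxt))
    return labels[0], triples


def decode_list(head, triples):
    """Walk rdf:rest links from head and collect the rdf:first values."""
    lookup = {(s, p): o for (s, p, o) in triples}
    items = []
    while head != RDF_NIL:
        items.append(lookup[(head, RDF_FIRST)])
        head = lookup[(head, RDF_REST)]
    return items


# The token sequence read off Figure 3.
program = ["krs:josh", "foaf:knows", "stack:starApply", "foaf:name", "stack:apply"]
head, triples = encode_list(program)
```

Because the program is nothing more than triples, it can be stored, queried, and annotated alongside the data it operates on.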

FABL is an object-oriented language that has some 
similarities in syntax to JavaScript, compiles down to 
RDF computing instructions, and is executed by the 
FABL RVM. Like Neno, the native objects of FABL
are RDF resources; however, the classes of FABL are
DAML+OIL classes [25]. An example of FABL code is:

allocate('foaf:knows', Property);

class("foaf:Person");
restrict foaf:knows
  {allValuesFrom foaf:Person}
endClass();

boolean function isFriend(foaf:Person p,
                          foaf:Person q) {
  return contains(p..foaf:knows, q);
}

The "dot dot" notation in the isFriend method iterates
over all the foaf:knows properties, as properties are
treated as sequences of values.

Adenosine is described by its originator as 

[...] a language designed both to work on, and 
be distributed over, the Semantic Web. The lan- 
guage exploits the expressiveness of RDF whilst 
adopting a clean syntax based on a combination 
of Notation3 and ECMAScript.¹¹

An example of the Adenosine syntax is: 

lanl:isFriend a std:Method;
  std:onClass lanl:Person;
  std:function lanl:isFriendFunction.

@function lanl:isFriendFunction(p1, p2) {
  return p2 in p1.foaf:knows;
}

Adenosine RDF code is executed by an RVM called 
Callaghan. 

Finally, the isFriend method is demonstrated in 
Adenine. Note that both Adenine and Adenosine have 
a similar syntax, and in fact, Adenosine was developed 
after Adenine in order to provide (as decided by the 
designer of Adenosine) a better syntax. Hence the similar 
names that the two languages have. 

add { lanl:Person
  rdfs:subClassOf foaf:Person ;
}

add { foaf:knows
  rdf:type rdf:Property ;
  rdfs:domain foaf:Person ;
  rdfs:range foaf:Person ;
}

method lanl:isFriend p1 p2
  return (contains p1 foaf:knows p2)

2.2.2 RDF Toolkits and RDF-to-Object Mappers 

The previously presented RDF programming languages 
serve a different purpose than RDF toolkits such as 
Jena [26],¹² Sesame [27],¹³ Redland [28],¹⁴ RDFStore,¹⁵
RDFLib,¹⁶ and Pyrple.¹⁷ The purpose of these toolkits
is to provide a mechanism by which RDF data can be 
accessed and manipulated through the constructs of a 
specific non-RDF-based programming language such as 
Java, C, PHP, Perl, Python, and/or Ruby. The difference 



11. Adenosine was previously available at http://www.netalleynetworks.com/community/igeldart/research/adenosine/

12. Jena is currently available at http://jena.sourceforge.net/

13. OpenRDF is currently available at http://www.openrdf.org/

14. Redland is currently available at http://librdf.org/



between these toolkits and the RDF-based programming
languages presented previously is that RDF programming
languages are designed specifically to work
with RDF and as such, can provide

• type-checking at the RDF level,

• data and process encapsulation,

• language operators to deal specifically with the RDF
data model,

• no impedance mismatch between RDF and the manipulating
language, and, of specific differentiation,

• a representation of the procedural information
within RDF.

RDF-to-object data mappers are related to RDF toolkits 
in that they aid a developer in utilizing RDF data in a 
programming language environment. Example RDF-to-object
data mappers include Schemagen,¹⁸ Elmo,¹⁹ Frege
[29], and ActiveRDF [30]. The purpose of an RDF-to-object
data mapper is to alleviate the issues surrounding
the impedance mismatch between the RDF data model
and typical object-oriented data models. This is accomplished
by 1.) automatically generating class definitions
in the non-RDF language that can interact with an RDF
representation and 2.) automatically populating these objects
using RDF data. With RDF-to-object mapping, what
is preserved in the Semantic Web is the description of
the data contained in an object (i.e. object fields), not an
explicit representation of the object's process information
(i.e. object methods). By explicitly encoding method data,
the Semantic Web contains all the information required
to retrieve and execute the behaviors of the object. In
this way, RDF programming languages do
not require a separate, non-RDF programming environment
to function. Moreover, the difficulties associated
with RDF-to-object mappings [31], [30] are not present
in RDF-based programming languages. Some of the
distinctions between the semantics of object-oriented
programming languages and the semantics of RDF are
made salient when understanding the distinction between
frame-based languages and ontological languages
such as OWL [32].



2.2.3 Other Web-Based Process Descriptions 
Rule-based markup languages, designed for the World 
Wide Web and the Semantic Web, are a related area 
of research. Similar to RDF programming languages, 
the intent of this research is to formalize computing 
instructions within the data repository itself, whether 
that data repository be the World Wide Web and/or
the Semantic Web. The standardization group focused
on the W3C RuleML²⁰ initiative has developed XML
languages for encoding process information as well as
translators for representing this information in RDF [33].
Particular variants of RuleML include a first order logic 
language (FOL-ML) [34] and an object-oriented language 



15. RDFStore is currently available at http://rdfstore.sourceforge.net/

16. RDFLib is currently available at http://rdflib.net/ 



17. Pyrple is currently available at http://infomesh.net/pyrple/ 



18. Schemagen is currently available at http://jena.sourceforge.net/ 

19. Elmo is currently available at http://www.openrdf.org/ 

20. RuleML is currently available at http://www.ruleml.org/
(OO-ML) [35]. To provide an example of FOL-ML, the
following logic statement

$$\forall x.\ [\mathit{knows}(\mathit{marko}, x) \rightarrow \mathit{knows}(\mathit{dr\_wh}, x)]$$

states that for all the people that Marko knows, Dr.
Wh also knows those people. This statement can be
represented in FOL-ML as

<Forall>
  <Var>x</Var>
  <Implies>
    <Atom>
      <oid>
        <Ind uri="lanl:marko"/>
      </oid>
      <slot>
        <Ind uri="foaf:knows"/>
        <Var>x</Var>
      </slot>
    </Atom>
    <Atom>
      <oid>
        <Ind uri="lanl:dr_wh"/>
      </oid>
      <slot>
        <Ind uri="foaf:knows"/>
        <Var>x</Var>
      </slot>
    </Atom>
  </Implies>
</Forall>
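The effect of executing this single rule over a set of triples can be sketched in Python. This is a hand-written illustration of what the implication means, not how jDrew or any other RuleML engine evaluates it; the function `apply_rule`, the resources lanl:josh and lanl:gary, and the sample data are assumptions.

```python
def apply_rule(store):
    """Add (dr_wh knows x) for every x that marko knows; return the new triples."""
    inferred = {
        ("lanl:dr_wh", "foaf:knows", o)
        for (s, p, o) in store
        if s == "lanl:marko" and p == "foaf:knows"
    }
    new = inferred - store
    store.update(new)
    return new


# Hypothetical starting facts.
store = {
    ("lanl:marko", "foaf:knows", "lanl:josh"),
    ("lanl:marko", "foaf:knows", "lanl:gary"),
    ("lanl:dr_wh", "foaf:knows", "lanl:gary"),
}

added = apply_rule(store)
```

Applying the rule a second time adds nothing, since the store has already reached a fixed point with respect to this one implication.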

The various RuleML reasoners serve as the virtual
machines that execute RuleML documents. For example,
jDrew and its object-oriented variant OO jDrew [36]
are deductive reasoning engines for RuleML.²¹ With
respect to rule-based systems designed specifically for
the Semantic Web, there currently exists the Semantic
Web Rule Language (SWRL) [37], which found its roots in
RIF (Rule Interchange Format) [38]. The Pellet reasoner
currently supports SWRL rules [39]. Moreover, the Pellet
reasoner, along with other description logic reasoners,
can execute the reasoning rules of the OWL language. In
this respect, the OWL language is a process description,
albeit one that is not Turing complete. Also, there exist the
Euler proof mechanism²³ for reasoning and Cwm²⁴ for
general-purpose data processing on the Semantic Web.
Finally, another area where process information is
specifically encoded in the Semantic Web is the OWL-S service
description framework [40].
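
To illustrate what it means for a reasoner to "execute" a rule, the following Python sketch applies the FOL-ML statement above by naive forward chaining over a small set of triples. The `apply_rule` helper and the resources `lanl:alice` and `lanl:bob` are assumptions made for this example only; the sketch does not reflect how jDrew, Pellet, or any other engine is implemented.

```python
# Naive forward chaining for: forall x. knows(marko, x) -> knows(dr_wh, x).
# Triples are (subject, predicate, object) tuples of prefixed names.

def apply_rule(triples):
    """Return the input triples plus every triple the rule derives."""
    derived = {
        ("lanl:dr_wh", "foaf:knows", o)
        for (s, p, o) in triples
        if s == "lanl:marko" and p == "foaf:knows"
    }
    return set(triples) | derived


# Hypothetical example data (lanl:alice and lanl:bob are invented).
graph = {
    ("lanl:marko", "foaf:knows", "lanl:alice"),
    ("lanl:marko", "foaf:knows", "lanl:bob"),
}
closure = apply_rule(graph)
```

Applying the rule a second time adds nothing new, which is the fixed point a forward-chaining engine stops at.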

3 RDF Instructions and the RVM 

As the Semantic Web is simply a data structure and does
not, in and of itself, have the ability to compute, external
machines are required to manipulate its state by adding
and removing triples. Even when state transition rules
are explicitly encoded in the Semantic Web as machine
instructions, there must still exist a computing machine
that is able to process those instructions.

21. jDrew is currently available at http://www.jdrew.org/

22. Pellet is currently available at http://pellet.owldl.com/

23. Euler is currently available at http://www.agfa.com/w3c/euler

24. Cwm is currently available at http://www.w3.org/2000/10/swap/doc/cwm

3.1 An Introduction to Virtual Machines 

A virtual machine is a computing machine represented
in software as opposed to hardware (e.g. logic gates)
[41]. There are many machines that fit this description,
and they range in complexity from low-level VHDL
machines [42] to the high-level interpreters of scripting
languages such as Perl, JavaScript, and Python. Perhaps
the most popular virtual machine is the Java virtual
machine (JVM) of the Java programming environment
[43]. There are two primary components to the Java
environment: the Java compiler (i.e. javac) and the
JVM (i.e. java). The Java compiler translates human
readable/writeable Java source code into Java byte-code
(e.g. javac Person.java → Person.class). Java
byte-code is executed by the JVM (e.g. java Person).
Each byte-code instruction alters the state of the JVM:
variables are declared and changed, and the sequence of
changes ultimately carries out a user-defined computation.

3.2 An Introduction to RVMs 

An RVM is any virtual machine that interprets RDF 
computing instructions. Like typical virtual machines, 
RVMs can vary in the degree of detail that they formally 
represent. In the Ripple environment, the Ripple RVM's 
state and process rules are implemented in the Java 
language [18]. Thus, the Ripple RVM runs on the JVM.²⁵
In the Neno/Fhat environment, the Fhat RVM represents
its state in RDF and the process by which that state is
altered in Lisp [15]. Thus, in Neno/Fhat, not only are
the RDF computing instructions encoded in the Semantic
Web, but so is the state of the RVM (i.e. its stacks, frames,
program counter, etc.). The purpose of encoding an RVM
state in RDF is to migrate RVMs between physical machines
(refer to Section 4.2). However, note that there will
always be a level of indirection in which computation is 
moved out of the Semantic Web to the physical hardware 
which supports it. In the end, it is the physical hardware 
that changes the state of the Semantic Web. Moreover, 
it is the laws of physics that drive the evolution of a 
hardware processor. Thus, in order to compute, every 
level of process abstraction must be grounded in (or 
founded on) some physical process. 

An RVM has four primary components: the RDF
computing instructions, the RVM state, the RVM process,
and a triple store or web server interface.²⁶ All of these
components are diagrammed in Figure 4, where RI
represents RDF computing instructions, RS represents
the RVM state in RDF, and D represents other,
non-procedural triples in the triple store (i.e. other RDF data).

25. It is possible, given that Ripple is Turing complete, to build a
completely RDF-based virtual machine in Ripple and thus encode
both Ripple programs and the Ripple RVM in the Semantic Web as
RDF computing instructions. This is also possible with the other RDF
programming languages as they are all Turing complete.

26. The concepts presented in this subsection deal specifically with
triple store interfaces only.



[Figure 4: a host at 127.0.0.1 containing a triple store and an
RVM. Recoverable labels: Triple Store, RVM, RDF Instructions,
RVM State, RVM Process, Triple Store Interface.]

Fig. 4. The components of an RVM-based computer. The
RDF instructions and RVM state are boxed in dotted lines
to signify that they can be represented in RDF and thus
can be placed in a triple store. The double-arrowed line
connecting the triple store interface to the triple store is a
read/write/delete protocol such as SPARQL/Update.

The graph of RDF computing instructions that the 
RVM interprets dictates the evolution of the RVM's state: 
the instruction being executed, the state of the heap, 
the state of all the stacks, etc. Because the RVM state 
can be represented in RDF, it can be distributed in the 
same manner as any other RDF data (e.g. as a set of 
statements in an RDF triple store or as an RDF document 
on the Web). The role of the RVM process, on the other 
hand, is to manipulate the RVM state and thereby carry 
out a computation. RDF data is simply a data structure. 
Whether that structure includes static or procedural in- 
formation does not endow it with the ability to compute. 
In order for that structure to evolve, it relies on some 
external process to read, write, and delete triples from 
it. Thus, in order to fully compute, an RVM state relies on 
an RVM process to alter it. It is through the triple store 
interface that an RVM process is able to query (i.e. read) 
and alter (i.e. write and delete) an RVM state, computing 
instructions, and other RDF data. In Figure |4j the double- 
arrowed line between the triple store interface and the 
triple store denotes a read / write / delete protocol such as 
SPARQL/Update lETl . 
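
The division of labor described above, in which passive RDF data is altered by an external process through a read/write/delete interface, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `TripleStore` class, the `step` function, and the URIs `rvm:1`, `inst:a`, and `inst:b` are invented for the example, and the in-memory set merely stands in for a real protocol such as SPARQL/Update. Only the property names `programLocation` and `nextInst` are taken from the Fhat vocabulary.

```python
class TripleStore:
    """A toy in-memory triple store offering read, write, and delete."""

    def __init__(self):
        self.triples = set()

    def write(self, s, p, o):
        self.triples.add((s, p, o))

    def delete(self, s, p, o):
        self.triples.discard((s, p, o))

    def read(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]


def step(store, rvm):
    """One tick of an RVM process: advance programLocation via nextInst.

    Returns False when the machine has no current or no next instruction.
    """
    current = store.read(rvm, "programLocation")
    if not current:
        return False  # no program location: the RVM cannot compute
    inst = current[0][2]
    store.delete(rvm, "programLocation", inst)
    following = store.read(inst, "nextInst")
    if not following:
        return False
    store.write(rvm, "programLocation", following[0][2])
    return True


store = TripleStore()
store.write("inst:a", "nextInst", "inst:b")            # RI: instructions
store.write("rvm:1", "programLocation", "inst:a")      # RS: machine state
store.write("lanl:marko", "foaf:knows", "lanl:josh")   # D: other RDF data
```

Note that `step` keeps nothing between calls: everything it needs is read from, and written back to, the store.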

With respect to Neno/Fhat and the other related RDF
programming environments (i.e. Ripple, FABL, Adenosine,
and Adenine), RI and RS are located at different
levels of abstraction.²⁷ These differences are articulated
in the following itemization.

• Other: only D and RI are in the triple store. RS is 
represented in local memory. 

• r-Fhat: D is in the triple store, but RI and RS can 
move between the triple store and the local memory. 

• Fhat: D, RI, and RS are all contained in the triple 
store. 

27. The term "other" is used to denote programming environments 
other than Neno/Fhat. 



Theoretically, it is possible to both read RDF computing
instructions and change the RVM state while it is
represented in the triple store, as in Fhat. However, due
to the read/write overhead incurred by such a model, it
is preferable to move the instructions and RVM state to
local memory for processing. In the Neno/Fhat environment,
this is accomplished through the r-Fhat RVM.²⁸
Also, in Ripple, computing instructions are moved to
local memory to increase processing speed.

The remainder of this section will discuss the architec- 
ture and instruction set of the Fhat RVM. 

3.2.1 The Fhat RDF Virtual Machine

The architecture of the Fhat virtual machine is defined
in OWL.²⁹ This architecture provides an abstract description
of an instance of a Fhat RVM. Figure 5 diagrams the
types of resources and relationships present in the Fhat
RVM architecture.

[Figure 5: class diagram of the Fhat RVM. Recoverable labels:
xsd:string, rdfs:Resource, Block.]

Fig. 5. The classes that compose the Fhat RVM.
The dashed lines denote the rdfs:subClassOf property
and the bracketed values (e.g. [0..1]) denote
the cardinality owl:Restrictions on particular properties.
All non-namespaced URIs are part of the
http://neno.lanl.gov namespace.

The Fhat RVM was designed to work with a predetermined
set of instructions known as the Fhat instruction
set. Figure 6 diagrams some of the more important
instructions supported by the Fhat RVM.

Fig. 6. The instruction set of the Fhat RVM.
The dashed lines denote the rdfs:subClassOf
or rdfs:subPropertyOf properties and the
bracketed values (e.g. [0..1]) denote the cardinality
owl:Restrictions on particular properties.
All non-namespaced URIs are part of the
http://neno.lanl.gov namespace.

28. r-Fhat stands for "reduced Fhat" as it reduces the amount of
RDF data being processed by representing the RVM state in the
programming constructs of the RVM process.

29. The Fhat RVM architecture and instruction set are currently
available at http://markorodriguez.com/docs/nenoDoc/

30. To preserve the readability of this article, the namespace prefix
for the courier font URIs is assumed to be http://neno.lanl.gov.

The following itemization presents a few of the more
common instructions in the Fhat instruction set and their
relationship to the Fhat RVM.³⁰

• Instruction: A Fhat RVM instance has a
programLocation pointer (i.e. PC) to the current
Instruction that it is processing. If no such
programLocation exists, then the Fhat RVM
cannot compute.

• PushValue: used to push resources onto the
OperandStack. For example, it can push the floating
point value 2.65 onto the stack for a later operation.

• Arithmetic: various subclass instructions include 
Add, Subtract, Multiply, and Divide. These 
pop two values off the top of the OperandStack, 
perform the specified operation, and push the com- 
puted value back on the OperandStack. 

• Invoke: initializes a Frame for a Method. The 
Frame contains the names (hasSymbol), values 
(hasValue), and scopes (fromBlock) of the local 
Variables of a Method. 

• Setter: used to assign a value to a Variable 
in the Frame of a Method. SetClear, SetMinus, 
Set, and SetPlus are subclasses of Setter. 

• Return: pushes the return value on the 
OperandStack and sets the programLocation to 
the Instruction popped off the ReturnStack. 
This instruction is used to return from a method. 

When Neno source code is compiled using the
Neno/Fhat compiler, a Fhat OWL API is generated. This
API provides an abstract representation of a Neno object
(known as a Neno class), its fields, its methods, and its
methods' instructions. The API representation incorporates
many owl:Restrictions to ensure that when
a Neno object is instantiated, there is an unambiguous
generation of instance-level RDF instructions. For example,
Figure 7 demonstrates how owl:Restrictions
are used to define the relationship between instructions
within a method body.³¹



[Figure 7: diagram of a Fhat API instruction sequence. Recoverable
labels: nextInst, owl:onProperty, owl:allValuesFrom,
owl:Restriction (_:A1, _:A2, _:A3), urn:uuid:1000, urn:uuid:1001,
urn:uuid:1010, PushValue, PushValue, Multiply.]

Fig. 7. A snippet of a Fhat API instruction sequence.
The dashed lines represent the rdfs:subClassOf
property. Note that other owl:Restrictions besides
_:A1, _:A2, and _:A3 are not presented. For example,
a PushValue instruction requires a value to push
onto the operand stack. What is presented demonstrates
how the sequence of instructions is fixed using
owl:Restrictions. All non-namespaced URIs are part
of the http://neno.lanl.gov namespace.

From a Fhat API, it is possible to instantiate Neno
objects and their methods. Figure 8 diagrams an RDF
instance of the lanl:Person makeFriend method
previously presented in Section

makeFriend(lanl:Person p) {
    this.foaf:knows =+ p;
}

In Figure 8, the makeFriend method is associated
with a particular lanl:Person, namely lanl:marko.
Methods, like RDF properties, are not dependent upon 
the classes that utilize them in their description. Thus, 
with Neno it is possible for many objects to share the 
same method description, or given the requirements 
of the computation being executed, it is possible for 
each object instance to have a unique method instance. 
The latter is desirable when migrating objects between 
different triple store environments. 

Finally, to provide another example of an RDF instruction
sequence, Figure 9 diagrams a simple arithmetic
instruction sequence that computes x = 1 + (2 × 3).
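
As a rough illustration of how such an instruction sequence drives a stack machine, the Python sketch below interprets a chain of instructions for x = 1 + (2 × 3) that is stored as triples. The instruction URIs (`urn:i1` through `urn:i6`), the `lookup` and `run` helpers, and the use of `hasSymbol` on the Set instruction are assumptions made for this example; the type and property names (PushValue, Multiply, Add, Set, nextInst, hasValue) follow the Fhat vocabulary described above, but the sketch is not the Fhat implementation.

```python
# Instruction graph as (subject, predicate, object) triples.
# The instruction URIs (urn:i1 ...) are hypothetical.
TRIPLES = [
    ("urn:i1", "rdf:type", "PushValue"), ("urn:i1", "hasValue", 1),
    ("urn:i1", "nextInst", "urn:i2"),
    ("urn:i2", "rdf:type", "PushValue"), ("urn:i2", "hasValue", 2),
    ("urn:i2", "nextInst", "urn:i3"),
    ("urn:i3", "rdf:type", "PushValue"), ("urn:i3", "hasValue", 3),
    ("urn:i3", "nextInst", "urn:i4"),
    ("urn:i4", "rdf:type", "Multiply"), ("urn:i4", "nextInst", "urn:i5"),
    ("urn:i5", "rdf:type", "Add"), ("urn:i5", "nextInst", "urn:i6"),
    ("urn:i6", "rdf:type", "Set"), ("urn:i6", "hasSymbol", "x"),
]


def lookup(subject, predicate):
    """Return the single object for (subject, predicate), or None."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def run(program_location):
    """Interpret instructions until no nextInst remains; return the frame."""
    operand_stack, frame = [], {}
    while program_location is not None:
        kind = lookup(program_location, "rdf:type")
        if kind == "PushValue":
            operand_stack.append(lookup(program_location, "hasValue"))
        elif kind in ("Add", "Multiply"):
            # Pop two operands, apply the operation, push the result.
            right, left = operand_stack.pop(), operand_stack.pop()
            operand_stack.append(left + right if kind == "Add" else left * right)
        elif kind == "Set":
            frame[lookup(program_location, "hasSymbol")] = operand_stack.pop()
        program_location = lookup(program_location, "nextInst")
    return frame


result = run("urn:i1")
```

The program counter here is just the URI of the current instruction, and following `nextInst` plays the role of incrementing it.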

Fig. 8. The instance level RDF representation of the
makeFriend method of a lanl:Person class. The
bolded terms above the resources denote the rdf:type
of the resource. Note that these rdf:types are
inferred types as the direct type is a minted UUID
with specific owl:Restrictions as demonstrated in
Figure 7. All non-namespaced URIs are part of the
http://neno.lanl.gov namespace.

Fig. 9. An instance of a set of instructions that will
set the variable x to the value 1 + (2 × 3). The bolded
terms above the resources denote the rdf:type of
the resource. All non-namespaced URIs are part of the
http://neno.lanl.gov namespace.

31. The Neno/Fhat compiler uses Universally Unique Identifiers
(UUID) when minting instruction URIs [44]. UUIDs are 128-bit
identifiers, usually written as 32 hexadecimal digits. For diagram
clarity, only a few characters of a UUID are presented.

4 Models of Computing on the Semantic
Web

The RVM architecture described in the previous section
opens up a number of common computing models to
the Semantic Web. The following three models will be
discussed throughout the remainder of this section.

• Open computing: an extension of Open Data in
which algorithms, virtualized computing machines,
and underlying hardware computing resources are
made publicly available.³²

• Distributed computing: a means by which pro- 
cesses are moved to the data, as opposed to the data 
to the processes. 

• Reflective computing: as computational process de- 
scriptions reside in the URI address space, reflection 
from the API to the RVM is possible. 

4.1 Open Computing 

The Open Data movement has "a philosophy and prac- 
tice requiring that certain data are freely available to 
everyone, without restrictions from copyright, patents 
or other mechanisms of control. Its ethos is similar to 
that of other open movements and communities such as 
Open Source and Open Access."³³

By Open Computing, we understand that 

• RDF computing instructions should be made freely 
available and easily accessible for code reuse, and 

• results of popular computations should be made 
publicly available. 

4.1.1 Towards a Web of Programs 

A key advantage of the RDF data model is that in using 
URIs to denote resources, RDF makes resource descrip- 
tions distributable in a way which leverages the existing 
infrastructure of the Web. For example, suppose that a 
Semantic Web application encounters the following URI: 

http://www4.wiwiss.fu-ber...

If the application is capable of dereferencing the URI 
over HTTP or querying on it through a SPARQL end- 
point, it will find RDF statements about Nepal, including 
demographic and geographical data. The application 
may then use these statements to solve problems. The 
practice of serving an RDF representation of a resource 
against its URI, as well as establishing links between 
such URIs (for example, owl:sameAs links) is known as
Linked Data. Thus, Linked Data is the RDF equivalent 
of the interlinked hypertext documents which make up 
the bulk of today's Web. Furthermore, it is the mech- 
anism by which physically isolated RDF data sets are 
amalgamated into a truly global Web of data. 

The distributed nature of Linked Data provides a
strong argument for the representation of programs and
program state in RDF. Embedding data structures and
algorithms in the web of Linked Data not only makes
them universally available, but also eliminates the need
for special-purpose software to retrieve and combine
programs which reference each other across the underlying
physical network. Effectively, generic Linked Data
interfaces such as the Semantic Web Client Library³⁴
serve as language-agnostic program linkers, aggregating
procedural as well as purely descriptive RDF data
dynamically, as needed in a computation. The development
of RDF programming languages was motivated by this
idea of a "web of programs". In RDF programming
languages, programs become "first-class" entities in the
Linked Data community's distributed graph data structure.
Incorporating them into further programs is as
simple as referencing their URIs.

32. In another context, the term Open Computing refers to services
which allow people to freely use computers in a lab setting. This is
not the definition that is used here.

33. Quoted from the Wikipedia article on Open Data at
http://en.wikipedia.org/wiki/Open_Data.

34. Semantic Web Client Library is currently available at
http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/semwebclient/

Similarly, as an alternative to what might be called
"simply" linked programs, encapsulating a computational
object in a named graph (as demonstrated later in
Section 4.2.1 and Figure 11) makes those objects available
to any application with knowledge of and access to the
graph. To reference a computational object, an application
must be able to infer from the object the URI of the
named graph which describes it, as well as the location
of a SPARQL endpoint from which it can retrieve the
named graph. Emerging solutions such as the Semantic
Web Crawling Sitemap Extension³⁵ aid in making this
information discoverable by Semantic Web applications.

Finally, in publishing an RDF program to the Semantic 
Web, it is good practice to provide documentation of the 
API. Furthermore, it is natural to express Semantic Web 
API descriptions in RDF, for instance as OWL ontologies. 
Like the JavaDoc framework, OWLDoc³⁶ is a useful
aid to developers in learning an OWL API and using 
it in their applications. The combination of machine- 
accessible program code and API documentation is par- 
ticularly appropriate for application scenarios involving 
the automated discovery and execution of programs. 



35. The Semantic Web Crawling Sitemap Extension is currently
available at http://sw.deri.org/2007/07/sitemapextension/

36. OWLDoc is currently available at
http://www.co-ode.org/downloads/owldoc/

37. For the purpose of this simple example, the caveat that RDF does
not allow a literal to be the subject of a triple is ignored.

4.1.2 Memoization and Computational Reuse

In some situations, it is better to query for the result
of a previous computation than to recompute it. This
idea is known as memoization [45], and the Semantic
Web and its open data philosophy provide an ideal
medium for such computational reuse. Consider the
simple function $f : \mathbb{N} \rightarrow \mathbb{N}$, where $f(n) = n + 1$.
If $f(5)$ has been previously computed, the result can
be represented by the RDF triple $(5, f, 6)$.³⁷ The results
of that computation can be reused by another RVM
at a later time. Memoization sacrifices space for time.
Of course, this is an impractical example, because
recomputing $f(5)$ is faster than querying the Semantic
Web for the mapping. However, for other, more
computationally complex operations, querying for a result
may be orders of magnitude faster than recomputing
it. For example, many graph analysis algorithms have a
relatively high complexity, such as PageRank (or
eigenvector centrality) at $O(EI)$, closeness centrality at $O(N^2)$,
and betweenness centrality at $O(NE)$,³⁸ where $N$ is the
number of vertices in the graph, $E$ is the number of
edges in the graph, and $I$ is the number of iterations.
Such algorithmic complexity becomes important when
considering the use of graph analysis algorithms on the
Semantic Web [47], [48]. The Semantic Web provides a
unique medium by which computations such as these
can be stored and later leveraged by other applications.
In this sense, not only is metadata open, but so are
computational results.
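
The query-before-compute pattern described above can be sketched in Python as follows. The shared `store` set, the `memoized` helper, and the call counter are assumptions made for this illustration; in practice the lookup would be a query against a triple store rather than a scan of a local set, and, as in the text, the sketch ignores the restriction that a literal cannot be the subject of an RDF triple.

```python
# Results are cached as (argument, function-name, result) triples,
# mirroring the (5, f, 6) example.
store = set()
calls = {"count": 0}  # counts how often the function is really evaluated


def f(n):
    calls["count"] += 1
    return n + 1


def memoized(name, func, n):
    """Query the store for a prior result; compute and publish on a miss."""
    for s, p, o in store:
        if s == n and p == name:
            return o
    value = func(n)
    store.add((n, name, value))
    return value


first = memoized("f", f, 5)   # computed, then stored
second = memoized("f", f, 5)  # answered from the store
```

For a function as cheap as this one the lookup costs more than the computation; the pattern only pays off when `func` is expensive relative to the query.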

4.2 Distributed Computing 

Virtual machine computing provides a layer of abstrac- 
tion between program instructions and the underlying 
hardware CPU. It is the role of the virtual machine to 
serve as a proxy to translate high-level instructions into 
the respective instruction set of the underlying CPU. 
While this indirection slows the computation down by 
incurring a translation step, it permits the same high- 
level instructions to execute on various physical hard- 
ware architectures. This idea is captured in the popular 
Java slogan of "write once, run anywhere." In the case 
of RVMs, this interoperability hides the underlying hard- 
ware infrastructure supporting the Semantic Web. 

As RDF-based data sets become larger and more nu- 
merous, the Semantic Web will be composed of more 
large-scale RDF repositories serving Linked Data to 
third-party applications. However, some applications 
may draw upon more data than can reasonably be 
moved from server to client. In such instances, it may 
be worthwhile to migrate an executing program, in 
the form of RVM state and RDF instructions, to the 
provider's environment for local processing. The notion 
of process migration has been proposed to remedy issues 
surrounding massive-scale data processing [49], [50] and
forms one of the primary purposes of Grid computing 

m 

The remainder of this section illustrates one possible 
mechanism for the migration of an RVM across different 
hardware hosts in order to accomplish a distributed 
computation within the Semantic Web infrastructure. 
However, before doing so, a discussion of the role of 
named graphs in distributed Semantic Web computing 
is required. 

4.2.1 The Role of Named Graphs

In a Semantic Web computing environment where in- 
structions, virtual machines, and data commingle within 
a single RDF data structure, there is an increased need 
for trust, security, and provenance mechanisms. For ex- 
ample, it may be necessary to 

• group RVM states and RDF instructions for ease of
migration and identification,

• "sandbox" RVMs and RDF instructions to prevent
malicious or poorly written code from destroying a
triple store's data integrity, and

• control the permissions that foreign RVMs and RDF
instructions have in a triple store environment.

38. This complexity is for unweighted betweenness centrality.
Weighted betweenness centrality has a complexity of
$O(NE + N^2 \log N)$. See [46] for more information on
betweenness centrality.

The most fundamental construct supporting the above
three requirements is RDF reification. Reification
provides a way to make statements about statements. In
RDF, a reified triple may be the subject or object of
another triple. While the concept of RDF reification
was initially introduced in the RDF specification with
the rdf:Statement construct, recent developments in
named graphs (or quads) provide a more manageable
solution to triple reification [51].

In a named graph, a "triple" is an ordered set of
four elements (s, p, o, g).³⁹ The g URI, which represents
a named graph, can be used as the subject or object
of another statement and thus serves as a mechanism
for attaching metadata to the graph. This metadata
may include usage statistics, human-level descriptions
(e.g. rdfs:comment), access control permissions [52],
and/or provenance information.

Figure 10 demonstrates how a named graph can
encapsulate data, state, and instructions. It is possible to
attach security metadata to named graph B such that
the RVM and RDF computing instructions contained in
it have specific permissions with respect to the triples
in named graph A. It is also straightforward to extract
the RVM state and its current instructions by simply
selecting triples from named graph B. For example, the
query

SELECT ?x ?y ?z
WHERE {
  GRAPH <B> {
    ?x ?y ?z } }

yields the entire B graph and thus, both the RDF
computing instructions and the RVM state.

The Neno/Fhat programming environment uses
named graphs to encapsulate computational objects
for ease of migration, code sharing, and internally
for garbage collection. For example, in Figure
lanl:marko is a URI; however, at a higher level of
abstraction, lanl:marko is a graph-based object with
method declarations and explicit method instructions.

Fig. 10. Using named graphs to encapsulate RVM state
and RDF computing instructions. The rdf:type of the
RVM state resources is provided in bold next to the
resource. All non-namespaced URIs are part of an example
RVM namespace denoted http://example.com/rvm.

Fig. 11. Using named graphs to encapsulate
computational objects. All non-namespaced URIs
are part of an example RVM namespace denoted
http://example.com/rvm.
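
To make the named graph mechanics of this subsection concrete, the following Python sketch stores quads as 4-tuples, selects everything in graph B in the spirit of the GRAPH query, and reads back metadata attached to the graph name itself. The graph names A and B follow Figure 10; the quad contents, the `meta` graph, and the `select_graph` helper are assumptions made for the example.

```python
# Quads are (subject, predicate, object, graph) tuples.
# Graph names "A" and "B" follow Figure 10; the contents are invented.
quads = {
    ("lanl:marko", "foaf:knows", "lanl:josh", "A"),
    ("rvm:1", "rdf:type", "rvm:RVM", "B"),
    ("rvm:1", "programLocation", "inst:a", "B"),
    ("inst:a", "rdf:type", "PushValue", "B"),
    # Metadata about graph B itself: the graph name is used as a subject.
    ("B", "rdfs:comment", "RVM state and instructions", "meta"),
}


def select_graph(quads, graph):
    """Similar in spirit to SELECT ?x ?y ?z WHERE { GRAPH <g> { ?x ?y ?z } }."""
    return {(s, p, o) for (s, p, o, g) in quads if g == graph}


rvm_and_instructions = select_graph(quads, "B")
about_b = {(p, o) for (s, p, o, g) in quads if s == "B"}
```

Because the graph name is an ordinary resource, permissions or provenance could be attached to it in the same way as the comment above.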

39. While such triples are actually quads, the term "triple" will be
used as this is a popular convention.



Section 4.2.2 will now discuss distributed computing 
on the Semantic Web using named graphs. 

4.2.2 RVM Compute Farms 

An open hardware provider may use an RVM farm 
to manage concurrent RVMs computing on the local 
triple store. For example, a Linked Data repository may 
provide an RVM farm that is used to execute RVMs and 
instructions that are working with the data currently 
in its repository. An RVM farm polls its associated 
triple store for non-executing RVM states. Once a non- 
executing state has been found, the RVM farm will 
spawn an RVM process to execute it. For instance, an 
RVM farm might use the query 

SELECT ?x
WHERE {
  ?x <rdf:type> <rvm:RVM> .
  ?x <rvm:needsProcess> "true"^^xsd:boolean }

to locate RVM states that need processing. If a URI binds
to ?x, an RVM process is created. The newly created
RVM process is passed the bound ?x URI as a parameter.
Thus, the newly created RVM process knows which RVM
state to harvest and execute. An RVM process maintains
no state information and thus, if an RVM process halts
(for whatever reason), the current state of computation
is maintained in the RVM state. That is, the state of the
stacks, program counter, etc. is frozen until another
RVM process can continue its execution. In this way,
encoding the RVM state in RDF makes it suitable for migration
between triple store environments, and thus, RVM farms.
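
The polling behavior of an RVM farm can be sketched in Python as follows. The function names, the example RVM URIs, and in particular the step that flips `rvm:needsProcess` to false once a process has been spawned are assumptions made for this illustration; the text does not specify how a farm marks a state as claimed.

```python
def find_waiting_rvms(triples):
    """Mimic the farm's SELECT query: RVMs whose needsProcess flag is true."""
    rvms = {s for (s, p, o) in triples if p == "rdf:type" and o == "rvm:RVM"}
    return sorted(
        s for s in rvms if (s, "rvm:needsProcess", True) in triples
    )


def poll_once(triples, spawn):
    """Spawn one process per waiting RVM state and clear its flag."""
    started = []
    for rvm in find_waiting_rvms(triples):
        # Assumed bookkeeping: mark the state as claimed before spawning.
        triples.discard((rvm, "rvm:needsProcess", True))
        triples.add((rvm, "rvm:needsProcess", False))
        spawn(rvm)  # the process receives only the URI of the state
        started.append(rvm)
    return started


# Hypothetical store contents: two RVM states, one waiting for a process.
triples = {
    ("rvm:1", "rdf:type", "rvm:RVM"), ("rvm:1", "rvm:needsProcess", True),
    ("rvm:2", "rdf:type", "rvm:RVM"), ("rvm:2", "rvm:needsProcess", False),
}
spawned = []
started = poll_once(triples, spawned.append)
```

Since the spawned process receives only a URI, a halted process can be replaced by another one that is handed the same URI.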



Figure 12 diagrams a migration pattern between two
triple store environments, where one environment is
located at 127.0.0.1 and the other is located at
127.0.0.2.

Fig. 12. Migrating RVM and RDF computing instructions
between host triple stores.

Starting at t = 1 in Figure 12 (top left corner), an RVM
state and RDF instructions named graph (denoted R/I)
exist in the 127.0.0.1 triple store. At t = 2, the local
RVM farm locates the RVM state and spawns a new RVM
process. The RVM process moves the RVM state and
instructions to local memory to perform a computation.⁴⁰
Assume that the RVM uses the data in the named graph
denoted D1 in its computation. At t = 3, the RVM has
finished its computation with D1 and inserts its state and
instructions back into the 127.0.0.1 triple store. The
RVM process at 127.0.0.1 notifies the RVM farm at
127.0.0.2 that it has an RVM that wishes to migrate
to its triple store. The RVM farm at 127.0.0.2 then
harvests the RVM and its instructions using a SPARQL
SELECT query or HTTP GET request.⁴¹ Thus, what is
migrated is the RDF instructions as well as the state of
the RVM which includes, amongst other information, a
populated heap that it is using in its computation.

At t = 4, the R/I named graph is located in the
triple store at 127.0.0.2. At t = 5, the RVM farm
at 127.0.0.2 locates R/I and then spawns an RVM
process. The newly created RVM process moves the
RVM state and instructions to main memory for local
processing on the D2 named graph data set. When the
RVM no longer requires D2 for its computation, it can be
moved back to the 127.0.0.2 triple store at t = 6 and
ultimately, migrate to yet another triple store at t = 7.
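
The core of this migration pattern, moving one named graph from one store to another, can be sketched in Python as follows. The two sets stand in for the triple stores at 127.0.0.1 and 127.0.0.2, and the `harvest` and `migrate` helpers as well as the quad contents are assumptions made for the example. The sketch also assumes that the source store drops the graph once it has been harvested, which the text does not state explicitly.

```python
def harvest(store, graph):
    """Remove and return all quads of one named graph (the SELECT/GET step)."""
    moved = {q for q in store if q[3] == graph}
    store.difference_update(moved)  # assumed: the source forgets the graph
    return moved


def migrate(source, target, graph):
    """Move a named graph, e.g. RVM state plus instructions, between stores."""
    target.update(harvest(source, graph))


# Hypothetical contents of the two triple stores from Figure 12.
store_1 = {
    ("rvm:1", "programLocation", "inst:b", "R/I"),
    ("inst:b", "rdf:type", "Add", "R/I"),
    ("d:1", "rdf:type", "lanl:Person", "D1"),
}
store_2 = {("d:2", "rdf:type", "lanl:Person", "D2")}

migrate(store_1, store_2, "R/I")
```

The data graphs D1 and D2 never move; only the comparatively small R/I graph travels to where the data is.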

40. Moving the RVM state and instructions to memory ensures a
more efficient use of clock cycles. This was articulated previously when
discussing the r-Fhat RVM.

41. Another approach would be to have the RVM process at
127.0.0.1 INSERT/PUT the RVM state and instructions into the
triple store at 127.0.0.2.

Such a model of distributed computing is dependent
on mechanisms of trust and security on the Semantic
Web. A simple solution would be to allow a foreign
RVM to read, write, or delete from its own named graph
and any named graphs that it spawns, but to allow it
only to read from other named graphs in the triple store.
Furthermore, limiting the number of triples in a named
graph can prevent the creation of an excessive amount of
data by a foreign RVM. Finally, in Neno/Fhat, a simple
"halt" mechanism regulates the number of clock cycles
that an RVM state can utilize.
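
One way to picture the simple policy described above is the following Python sketch, in which a foreign RVM may read any named graph but may write to or delete from only the graphs it owns, subject to a per-graph size limit. The quota value, the `allowed` and `guarded_write` helpers, and the example quads are all assumptions made for this illustration; they do not describe a mechanism of Neno/Fhat or of any particular triple store.

```python
MAX_TRIPLES_PER_GRAPH = 1000  # hypothetical quota


def allowed(rvm_graphs, operation, graph):
    """A foreign RVM may read anything but write/delete only its own graphs."""
    if operation == "read":
        return True
    return operation in ("write", "delete") and graph in rvm_graphs


def guarded_write(store, rvm_graphs, quad):
    """Add a quad if permitted and the target graph is under its quota."""
    graph = quad[3]
    size = sum(1 for q in store if q[3] == graph)
    if not allowed(rvm_graphs, "write", graph) or size >= MAX_TRIPLES_PER_GRAPH:
        return False
    store.add(quad)
    return True


store = {("lanl:marko", "foaf:knows", "lanl:josh", "D2")}
own = {"R/I"}  # graphs owned or spawned by the visiting RVM
ok = guarded_write(store, own, ("rvm:1", "programLocation", "inst:c", "R/I"))
denied = guarded_write(store, own, ("lanl:marko", "foaf:knows", "x:evil", "D2"))
```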

4.3 Reflective Computing 

The concept of reflection in computing refers to the
ability for software to modify itself during runtime [53],
[54]. For instance, in the Java programming language, it
is possible for an object to "look at" the API at runtime
and make choices as to flow of execution. This is made
possible through the java.lang.reflect package of
the core Java API. This type of reflection exists because
the description of an object is at the same level of
abstraction as the object itself. In general, reflective computing
is possible when particular states of a program are made
available within a representation that is processable
by the executing program. In many respects, reflective
computing parallels the common model
of reification in logic, and specifically in RDF using
named graphs or the rdf:Statement construct. With
reification, it is possible to make a statement about
a statement. With reflection, it is possible to compute
with the description of a computation, whether that
description is of a particular state of computation or of
the program itself [55].

Reflective computing has found application in het- 
erogeneous object environments where an object may 
need to discover new objects and "learn" what func- 
tionality they have or reason about their functionality 
before leveraging them within a computation. OWL-S 
[40] and other web service description frameworks serve 
a similar purpose in that they provide detailed machine- 
readable descriptions of the input requirements, pro- 
cessing stages, and ultimate output of a service. With 
respect to RDF-based programming languages and RDF- 
encoded RVMs, both procedural and machine informa- 
tion are encoded at the same level of abstraction, namely 
in RDF, thus making them readily available for run- 
time analysis. While programming constructs such as 
packages, class descriptions, methods, and so forth are 
utilized for the purpose of procedural encapsulation, it 
is possible to make use of queries on and manipulations 
of an RDF graph in order to reason on and alter the 
full computing stack during run-time. This sub-section 
will present three types of reflection that are made 
salient in RDF computing: object and method reflection, 
instruction reflection, and machine reflection. 

4. 3. 1 Object and Method Reflection 

A triple store supports three basic operations: read, 
write, and delete. RDF programming languages make 
use of these operations to query and manipulate an 
RDF graph in order to evolve the RDF graph and thus, 
compute. It is possible for an RDF program to query 
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a triple store in order to retrieve information about 
the state of an object whether that object be itself or 
another object. For example, an object might execute a 
query to locate all objects of rdf:type lanl:Person 
as well as any methods associated with those objects. The 
following SPARQL query returns a list of instantiated 
lanl:Person objects and their associated method URIs: 

SELECT ?x ?y 
WHERE { 
  ?x <rdf:type> <lanl:Person> . 
  ?x <rvm:hasMethod> ?y } 

Given the URIs bound to ?x (the URIs of the lanl:Person 
resources) and ?y (the URIs of the lanl:Person methods), 
the querying object can make decisions as to how 
to utilize these URIs in its processing. For example, it 
could decide that if a particular lanl:Person resource 
has a makeFriend method, then it must be a "friendly" 
object and will make friends with that object as well as 
invoke that object to make friends with it, thus creating 
a symmetric foaf:knows relationship. 
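
The read, write, and delete operations and the two-pattern query above can be approximated with a small in-memory structure. The following Java listing is only a sketch under stated assumptions: the class TripleStoreSketch, its method names, and the resources ex:marko, ex:makeFriend, ex:machine, ex:Machine, and ex:step are hypothetical and do not belong to the article or to any RDF library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory triple store; not the API of any real RDF library.
class TripleStoreSketch {
    // Each triple is a String[3] holding subject, predicate, and object.
    private final List<String[]> triples = new ArrayList<>();

    // write: add one triple to the graph.
    void write(String s, String p, String o) {
        triples.add(new String[] {s, p, o});
    }

    // delete: remove every matching triple; returns true if any was removed.
    boolean delete(String s, String p, String o) {
        return triples.removeIf(t -> t[0].equals(s) && t[1].equals(p) && t[2].equals(o));
    }

    // read: return all triples matching the pattern, where null is a wildcard.
    List<String[]> read(String s, String p, String o) {
        List<String[]> out = new ArrayList<>();
        for (String[] t : triples) {
            boolean match = (s == null || t[0].equals(s))
                    && (p == null || t[1].equals(p))
                    && (o == null || t[2].equals(o));
            if (match) {
                out.add(t);
            }
        }
        return out;
    }

    // Mirrors the two patterns of the SELECT query: ?x rdf:type <type> . ?x rvm:hasMethod ?y
    // Each result is a {?x, ?y} binding pair.
    List<String[]> methodsOfInstances(String type) {
        List<String[]> bindings = new ArrayList<>();
        for (String[] typed : read(null, "rdf:type", type)) {
            for (String[] m : read(typed[0], "rvm:hasMethod", null)) {
                bindings.add(new String[] {typed[0], m[2]});
            }
        }
        return bindings;
    }

    // Builds a tiny example graph (all ex: resources are assumptions) and queries it.
    static List<String[]> demo() {
        TripleStoreSketch store = new TripleStoreSketch();
        store.write("ex:marko", "rdf:type", "lanl:Person");
        store.write("ex:marko", "rvm:hasMethod", "ex:makeFriend");
        store.write("ex:machine", "rdf:type", "ex:Machine");
        store.write("ex:machine", "rvm:hasMethod", "ex:step");
        return store.methodsOfInstances("lanl:Person");
    }

    public static void main(String[] args) {
        for (String[] b : demo()) {
            System.out.println("?x=" + b[0] + " ?y=" + b[1]);
        }
    }
}
```

A real triple store would index its triples and evaluate SPARQL directly; the nested loop here only illustrates how the two triple patterns are joined on ?x.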

This type of reflection is analogous to class reflection 
in Java. For example, in Java, the previous method 
reflection query may be executed as 

Method[] methods = Person.class.getMethods(); 

The class Person is queried for its set of methods 
which are returned as an array of Method objects. These 
Method objects can then be computed with like any 
other object. For example, 

Person marko = new Person(); 
Person josh = new Person(); 
Method[] methods = josh.getClass().getMethods(); 
for (Method m : methods) { 
  if (m.getName().equals("makeFriend")) { 
    marko.makeFriend(josh);  // direct invocation 
    m.invoke(josh, marko);   // reflective invocation 
  } 
} 

Both makeFriend and invoke are presented in the last 
two instruction lines to demonstrate the two ways in 
which the same method can be executed. 
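
For readers who wish to run the fragment, a self-contained version might look as follows. This is a sketch under assumptions: the article does not show the Person class, so the nested Person class, its knows list, and the helper befriend are hypothetical stand-ins.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ReflectionSketch {
    // Hypothetical Person class; the article does not give its definition.
    public static class Person {
        final String name;
        final List<Person> knows = new ArrayList<>();

        public Person(String name) {
            this.name = name;
        }

        // Records a one-directional "knows" link from this person to the other.
        public void makeFriend(Person other) {
            knows.add(other);
        }
    }

    // Returns true if both directions of the friendship were recorded.
    static boolean befriend(Person marko, Person josh) throws ReflectiveOperationException {
        for (Method m : josh.getClass().getMethods()) {
            if (m.getName().equals("makeFriend")) {
                marko.makeFriend(josh);  // direct invocation
                m.invoke(josh, marko);   // reflective invocation of the same method
            }
        }
        return marko.knows.contains(josh) && josh.knows.contains(marko);
    }

    // Wraps the checked reflection exceptions so the demo can be called directly.
    static boolean demo() {
        try {
            return befriend(new Person("marko"), new Person("josh"));
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("symmetric friendship: " + demo());
    }
}
```

The direct call adds josh to marko's list and the reflective call adds marko to josh's list, which corresponds to the symmetric relationship described for the RDF case.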



4.3.2 Instruction Reflection 

In an RDF computing environment, not only are 
methods exposed, but so are the instructions that 
compose those methods. What is returned by 
Person.class.getMethods() in the previous example 
is an array of pointers to the methods that are available 
from that class. In this way, the program, at run-time, is 
able to inspect the Person Java API. Once this method 
pointer has been acquired, in Java, it is possible to 
invoke the method: 

Person marko = new Person(); 
methods[0].invoke(marko, null); 



In an RDF computing environment this is equivalent 
to adding an Invoke instruction resource as the next 
instruction in the current instruction sequence. This en- 
sures that the next instruction to be processed by the 
executing RVM will invoke the method. Assuming y 
is the URI of the ?y binding of the previous SPARQL 
select query, the following SPARQL/Update command 
will provide the appropriate alteration of the flow of 
execution: 

INSERT DATA { 
  <ex:001> <rvm:nextInst> <ex:010> . 
  <ex:010> <rdf:type> <rvm:Invoke> . 
  <ex:010> <rvm:invokeMethod> <y> } 

As demonstrated, it is possible to reason about the 
current instruction sequence of a program and perhaps, 
insert new instructions as a result. This type of direct 
code manipulation supports evolutionary (or genetic) 
computing: at runtime, new code can be introduced into 
the system. Again, the runtime creation and manipula- 
tion of code is made possible by the fact that the API 
and the instructions are at the same level of abstraction: 
URIs, literals, blank nodes, and triples. 
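
The effect of such an insertion on the flow of execution can be sketched in Java by treating the nextInst edges as a map. This is an illustration under assumptions, not the article's mechanism: the class InstructionSpliceSketch is hypothetical, ex:002 is an invented prior successor of ex:001, and re-linking the new instruction to that successor is a step that goes beyond the INSERT DATA command shown above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of an instruction sequence as nextInst edges.
class InstructionSpliceSketch {
    // Maps an instruction URI to the URI of the instruction that follows it.
    private final Map<String, String> nextInst = new HashMap<>();

    // Adds a nextInst edge between two instructions.
    void link(String from, String to) {
        nextInst.put(from, to);
    }

    // Makes 'inserted' the instruction executed right after 'current'.
    // Assumption: the former successor of 'current' is re-attached after 'inserted'.
    void spliceAfter(String current, String inserted) {
        String oldNext = nextInst.put(current, inserted);
        if (oldNext != null) {
            nextInst.put(inserted, oldNext);
        }
    }

    // Follows nextInst edges from 'start' and returns the execution order.
    List<String> walk(String start) {
        List<String> order = new ArrayList<>();
        for (String inst = start; inst != null; inst = nextInst.get(inst)) {
            order.add(inst);
        }
        return order;
    }

    // ex:001 and ex:010 come from the text; ex:002 is an assumed existing successor.
    static List<String> demo() {
        InstructionSpliceSketch program = new InstructionSpliceSketch();
        program.link("ex:001", "ex:002");
        program.spliceAfter("ex:001", "ex:010");
        return program.walk("ex:001");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a triple store the same splice would require deleting the old nextInst triple as well as inserting the new ones; otherwise ex:001 would have two successors.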

4.3.3 Machine Reflection 

An RVM may execute instructions that manipulate itself. 
Thus, a machine can modify itself at runtime. This type 
of machine reflection is diagrammed in Figure 13. 



[Figure 13: RVM State panels at t=1 and t=2, each showing 
an RVM (ex:01) with hasOpStack and PC properties and an 
OperandStack (ex:00), with rdf:first appearing at t=2; 
Instructions panel showing a Push instruction (ex:10) 
linked by nextInst to a NoOp instruction (ex:11).] 

Fig. 13. RDF reflection at the RVM-level. Time is 
only specified for the RVM state, not the instructions. 
The rdf:type of the resource is presented 
in bold with the resource. All non-namespaced URIs 
are part of an example RVM namespace denoted 
http://example.com/rvm. 



In Figure 13, at t = 1, the RVM (ex:01) is pointing 
to an instruction (represented by the program counter 
PC property) that requires the RVM to push a pointer to 
itself onto its own operand stack (i.e. the Push instruction's 
value is the RVM resource). At t = 2, the RVM 
URI is on the top of the operand stack (i.e. rdf:first). 
Thus, at this point, any instruction that utilizes the 
operand stack will involve the RVM computing on itself 
(in Figure 13 this next operation is a NoOp, which will 
not alter the state of the machine). In this way, it is 
possible for the RVM to manipulate itself at runtime. 
If the machine's process is also encoded in RDF (e.g. by 
coding Fhat's process in Neno), then the complete RVM 
is subject to such machine-level reflection. In short, the 
RVM's stacks, frames, program counter, or whichever 
modeled components the RVM in question represents 
can be manipulated through machine-level reflection. 
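
The idea of a machine that places itself on its own operand stack can be imitated with an ordinary Java object. This toy sketch says nothing about the JVM itself. The Rvm class, its fields, and the resetPoppedMachine instruction are all hypothetical; in particular, the figure's example ends with a NoOp, whereas the sketch assumes a follow-up instruction that actually alters the machine.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MachineReflectionSketch {
    // Hypothetical toy machine: an operand stack plus a program counter.
    static class Rvm {
        final String uri;
        final Deque<Object> operandStack = new ArrayDeque<>();
        int programCounter = 0;

        Rvm(String uri) {
            this.uri = uri;
        }

        // Push instruction whose value is the machine itself.
        void pushSelf() {
            operandStack.push(this);
            programCounter++;
        }

        // Assumed instruction: pop a machine reference and rewind its program counter.
        void resetPoppedMachine() {
            Object top = operandStack.pop();
            if (top instanceof Rvm) {
                ((Rvm) top).programCounter = 0;
            }
        }
    }

    // Returns true if the machine found itself on its stack and then modified itself.
    static boolean demo() {
        Rvm rvm = new Rvm("ex:01");
        rvm.pushSelf();                                      // machine is now on its own stack
        boolean selfOnTop = rvm.operandStack.peek() == rvm;
        int pcBefore = rvm.programCounter;
        rvm.resetPoppedMachine();                            // machine manipulates itself
        return selfOnTop && pcBefore == 1
                && rvm.programCounter == 0 && rvm.operandStack.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("machine modified itself: " + demo());
    }
}
```

In the RDF setting the machine state is not hidden behind an object interface: the stack, program counter, and instructions are all triples, so the same manipulation could be expressed as reads, writes, and deletes on the graph.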

The previous example is an extreme case of reflection 
which is not found in typical programming environ- 
ments. For instance, in Java, it is not possible for a 
program to obtain a pointer to the JVM. Furthermore, 
the JVM is represented according to the instruction set 
of the underlying hardware CPU and thus, is not at the 
same level of abstraction. This reduces the ability to facil- 
itate machine reflection. In languages which compile to 
machine-specific code, such as C and C++, it is possible 
to get a direct pointer to the program being executed. In 
this way, a program may be manipulated at runtime. 

In an RDF-based programming language, reflection 
can be applied to the entire computational stack. How- 
ever, it is still possible to engineer code that respects the 
common principles of data hiding, encapsulation, and 
modularity. In many cases, such common development 
practices are preferable. The purpose of this section 
was to demonstrate the flexibility of this RDF-based 
programming style. 

In a Semantic Web computing environment where 
object persistence is an expected feature, the ability for 
objects to discover, reason about, and ultimately interact with 
other objects will be a key component of RDF-based 
applications. Moreover, with persistent RVMs, it will be 
important for RVM processes to discover and reason on 
RVM states before executing them. 

5 Conclusion 

The Semantic Web is a distributed environment in which 
descriptive world-models can be queried and manipu- 
lated by external applications. These external applica- 
tions leverage the world-wide repository of structured 
multi-relational data for the purpose of computation. 
This article has addressed the potential role of encod- 
ing such applications in the Semantic Web. Given the 
modeling power of RDF, it is possible to represent not 
only data, but also software and virtual machines in RDF. 
RDF-based programming languages were designed to 
take explicit advantage of the RDF data model. Unlike 
RDF APIs in other languages such as Java, C, etc., these 
languages do not require the developer to work with 
two different data models. There is no disjoint experience 
for the developer [19]. Furthermore, with the ability to 
encode virtual machines and their state in RDF, it is 
possible to migrate software and machines to other data 
sets around the world. This provides a distributed process 
infrastructure to the Semantic Web's distributed data 
structure. When the more procedural aspects of comput- 
ing are embedded in the Semantic Web, new computing 
models emerge that push the Semantic Web towards a 
distributed general-purpose computing environment. 